I. INTRODUCTION AND OBJECTIVES


A. GENERAL 

Marine fog is a hazard to shipping and to low-level flying over the open ocean 
and coastal waters. Wheeler and Leipper (1974) cited human and other losses to the 
United States Navy due to poor visibility caused by fog. On the other hand, marine 
fog can camouflage the location and motion of surface shipping. This works both 
ways; it helps protect friendly forces from discovery but makes it more difficult to seek 
out and destroy enemy forces. Since the Strategic Air Command is responsible for
interdicting enemy sea power through air operations and for conducting antisubmarine
warfare and aerial minelaying operations, forecasting marine fog is of great importance
to the U.S. Air Force as well as to the U.S. Navy.


B. ROLE OF THE NAVAL POSTGRADUATE SCHOOL 

In recent years, the Department of Meteorology, Naval Postgraduate School 
(NPS) has been studying the climatology of marine fog and marine visibility (Renard, 
Englebretson and Daughenbaugh, 1975; Willms, 1975; Renard, 1976). However, the
data network is not widespread enough over the oceans to generate sufficient informa- 
tion to analyze the initial visibility/fog conditions as a prerequisite for forecasting 
marine fog on a day-to-day basis. 

Because of the difficulties of forecasting fog directly, NPS researchers began to 
use Model Output Statistics (MOS) to estimate marine visibility (and implicitly marine 
fog). This effort has been concentrated mostly on areas of the North Pacific Ocean 
(Koziara, Renard and Thompson, 1983; Renard and Thompson, 1984). Karl (1984), 
Diunizio (1984) and Elias (1985) worked with a number of climatologically homoge- 
neous areas of the northwest North Atlantic Ocean. These researchers tested various 
MOS prediction schemes, using predictors from the Fleet Numerical Oceanography
Center’s (FNOC) Navy Operational Global Atmospheric Prediction System
(NOGAPS). Fatjo (1986) tested these same schemes on controlled (simulated) data
sets.


C. RESULTS TO DATE

Results so far suggest the problem of forecasting marine visibility is even more
intractable than previously supposed. Because of the difficulties of forecasting visibility
using more than two categories, Karl and Diunizio recommended switching to a two-
category visibility forecasting scheme. Also, Diunizio recommended including derived
predictors as well as direct model predictors; a derived predictor is a mathematical 
combination of two or more direct model predictors. Fatjo found that, in general, a 
good predictor can overcome gross data defects, suggesting it would be more profitable 
to improve the quality of a few key predictors than undertake a costly and massive 
upgrade to the data network. Fatjo also showed that the degree of statistical separa- 
tion between data sets is of the utmost importance in statistical forecasting using fitted 
distributions. 


As a result, Lowe¹ recommended a study of the statistical attributes of the
predictor data sets. He also expressed great unease at assuming the underlying
distribution for all predictors to be Normal, as has hitherto been the case. While seeing the
attraction of assuming a common distribution for each predictor (rather than having to
fit each one individually), Lowe questioned whether there might be a more generally
applicable distribution than the Normal.


D. PURPOSE OF THIS STUDY

Ultimately, it became clear that learning more about the statistical distribution of 
model output predictors vis-a-vis the occurrence of Fog and No Fog is a necessary 
ingredient in a MOS forecasting scheme for marine fog. In particular, it was felt useful 
to know for which predictors the Beta distribution may be safely substituted for the 
Normal distribution. 

Accordingly, this study presents the results of an investigation into the distribu- 
tional character of certain model output parameters which could be regarded as poten- 
tial marine fog predictors. In the process, three different distributions (Beta, Normal 
and Gamma) are compared, although the emphasis is on comparing the Beta and
Normal distributions.


E. SUMMARY OF STEPS TO BE TAKEN

First, determine the most likely NOGAPS predictors which might serve as fog
predictors.


¹P. R. Lowe is a senior scientist at the Naval Environmental Prediction Research
Facility, Monterey, CA. He is also Co-Advisor to this study.

Second, establish a working data base, consisting of observed data matched with
NOGAPS predictors. 

Third, compute the Beta, Normal and Gamma distributional forms of the 
predictors, along with the relevant statistical parameters of these distributions. 

Fourth, graphically fit each predictor population to the Beta, Normal and 
Gamma distributions. 

Fifth, for each distribution, apply Bayes’ Law of Inverse Probability to diagnose
(i.e., “predict”) the occurrence of Fog or No Fog.

Sixth, determine for which predictors the Beta distribution is competitive with the 
Normal. 

Seventh, compare the Gamma distribution with the Beta and Normal
distributions.


II. MODEL OUTPUT STATISTICS


A. GENERAL 

A Model Output Statistic (MOS) is a statistically-developed method which fore- 
casts a weather element of interest as a function of forecast variables available from a 
numerical weather prediction model. While some common meteorological variables are 
directly forecasted by the numerical model, like pressure and temperature, there are 
some important exceptions, such as fog and visibility. The Navy’s MOS program is 
being developed at the Naval Environmental Prediction Research Facility (NEPRF) to 
fill in that gap by producing forecasts of the elements not forecasted by the numerical 
model. The MOS procedure is also used to refine and tailor numerical model forecasts
to account for model errors and sub-synoptic scale influences.


B. MODEL OUTPUT PREDICTORS 

Meteorological variables directly forecasted by the numerical model are usually 
called model output predictors (MOP). A MOP is the model's “best guess” of the 
value of that variable at a particular point in space and time. 

The set of predictor values for a variety of meteorological variables may be 
imagined as an array of numbers, defined for that point in space and time. 
Considering the number of geographical points and forecast periods over which the 
numerical model produces arrays at a given time, there’s a massive volume of data 
involved. Portions of these data are usually archived at weather centers. 

In theory, if each numerical-model predictor array were perfectly accurate, it 
would predict the exact state of the atmosphere at a given time and locus. A statistical 
analysis of both the predictors and observations for that point in space, over a suitably 
long period of record, would be expected to show at least one predictor having a 
markedly different distribution of values between two events, such as Fog and No Fog. 
Thus, it’s concluded that this predictor (or group of predictors) “makes the difference” 
between whether Fog is present or absent.

As a crude example, assume this predictor to be the relative humidity (RH). It
might be that Fog occurs always and only when RH is, say, 90% or more, and,
conversely, No Fog occurs always and only when the RH is 89% or less. The search
for a perfect predictor of Fog, for that point in space, would be over.


C. MODEL OUTPUT PREDICTORS AND OBSERVED ELEMENTS 

Numerical model predictions are usually made and disseminated twice daily. 
Glahn and Lowry (1972) recognized that these model output predictors, together with 
the corresponding archived weather observations, collectively represent a wealth of 
valuable information. If a certain predictor value (or range of values) could be shown 
to be consistently related to the value (or range of values) of a certain observed vari- 
able, an association between the two could be established which, hopefully, would 
hold true in the future. 

As a result of Glahn’s and Lowry’s work, the Technique Development 
Laboratory of the National Weather Service began working on prediction equations 
that would establish statistical relationships between model output predictors and 
various weather elements of interest, known as predictands. The statistical
relationships are determined by multiple linear regression² (Glahn, 1983).


D. SOME LIMITATIONS TO MOS FORECASTING 

The basic assumption in MOS forecasting is that some relationship between the 
predictor and predictand, established from historical data, is valid under similar
circumstances in the future. Therefore, the quality of the predictand data is crucial.
These data consist of weather observations taken under widely varying circumstances.
They play two roles: they serve as initial data for the numerical model, and they verify
(where feasible) the accuracy of the predictions (i.e., MOS predictors).

Unfortunately, the raw observations initializing the numerical model don’t come 
close enough to capturing the true state of the atmosphere. For forecasting over the 
ocean, there aren’t enough unique observations regularly positioned over an area of 
interest, leaving large areas with data gaps. Except for some scattered weather buoys, 
observation platforms (i.e. ships) report from different locations as they travel, often at 
irregular intervals. Observers’ skills vary widely, from those of a trained meteorologist
to those of an ordinary crewman with minimal experience. Of special importance to this study,
horizontal visibility reports at sea are hampered by lack of visibility markers such as
are usually present on land. The weather transmission code limits the degree of detail
that a reported observation may include. The transmission network from ship to
weather center can delay, garble and lose observation data.


²A greatly modified form of Regression Estimation of Event Probability (REEP),
according to P. R. Lowe of the Naval Environmental Prediction Research Facility. 




The numerical model is a discrete-point representation of a spatial-temporal 
continuum. Consequently, it has to cut mathematical corners in order to reduce the 
complexity of the calculations and to meet forecast deadlines. Also, there still exists an 
imperfect and incomplete understanding of the complex dynamic processes of the 
atmosphere, especially on the smaller scales of time and space. 

Even if the observation limitations were corrected and the numerical model was 
made more sophisticated, and thus more accurate in its predictions of atmospheric 
variables, the basic problem of an imperfect match of predictors and predictands still 
remains. The relationships between the two are generally very complex and highly 
nonlinear, and MOS prediction equations can only attempt to capture the complex
feedback mechanisms between the different atmospheric variables.


E. NOGAPS MOS DATA 

In 1983, the U.S. Navy began developing a MOS computer program to forecast 
horizontal visibility at sea, using the FNOC NOGAPS model. NOGAPS produces 
predictor values for a variety of meteorological variables at global grid-points, spaced 
at intervals of approximately 2.5° latitude and longitude. These are the raw material 
from which MOS forecasts are produced. The data have been archived at FNOC and
were made available for this study.

III. STATISTICAL OVERVIEW


A. THE CONCEPT OF SEPARATION 

One of the attractions of statistics is its ability to extract valuable, hidden
information from a mass of raw data. For example, suppose there are n observations each
of Fog and No Fog for an ocean location. A plot of the occurrence of Fog versus
Predictor A might result in a distribution like Fig. 3.1a, with values of 290 occurring
most frequently, and with other values less frequently on each side of the mode. A
similar plot for No Fog might look like Fig. 3.1b, with a peak at 297. When these two
figures are combined into one (Fig. 3.1c), the relative frequencies can be readily
compared.

From Fig. 3.1c, it follows that a predictor value of T1 has a P1 likelihood of
being associated with Fog, and a Q1 likelihood with No Fog. Since P1 and Q1 are
about equal, it can be seen that a T1 value of Predictor A has about the same chance
of happening in either case and is thus of little associative use. In other words, given a
value of T1, valid for a particular point in time and space, and associating (i.e.,
forecasting) T1 at that time with, say, Fog, a correct forecast would be expected slightly
more than half the time, given repeated forecasts over the long term. This is because
P1 is only slightly greater than Q1. Since this is little better than tossing a coin, this
value of T is not a very skillful predictor.

However, a value of T2, with likelihoods P2 and Q2 of Fog and No Fog,
respectively, would be an excellent associative tool; a forecast of the event with the higher
likelihood would be expected to be right much more than half the time. This is
because Q2 is much greater than P2.

In this example, since most values of Predictor A exhibit a big difference between 
their respective P and Q values, this translates into the ability to make a correct fore- 
cast most of the time. 

Clearly, the attribute of the data that enables us to make such a confident
forecast is an important one; it might be described as the ability to discriminate between
different events with an acceptable degree of confidence. It will be referred to here as
separation.
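
As a rough numerical illustration of this idea, the sketch below evaluates two bell-shaped densities at one poorly separated and one well separated predictor value. The modes of 290 and 297 come from the example above; the Normal form, the common sigma of 3 and the two test values are assumptions made purely for illustration and are not taken from the figures.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density of a Normal(mean, sigma) distribution at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Modes follow the text; SIGMA = 3.0 is an assumed value.
FOG_MEAN, NO_FOG_MEAN, SIGMA = 290.0, 297.0, 3.0

def likelihoods(t):
    """Return (P, Q): the likelihood of value t under Fog and under No Fog."""
    return normal_pdf(t, FOG_MEAN, SIGMA), normal_pdf(t, NO_FOG_MEAN, SIGMA)

# Hypothetical predictor values: one near the crossover, one far from it.
for t in (293.4, 299.0):
    p, q = likelihoods(t)
    print(f"T = {t}: P = {p:.4f}, Q = {q:.4f}, Q/P = {q / p:.2f}")
```

Near the crossover of the two curves the likelihoods are almost equal, so the value carries little information; away from the crossover one likelihood dominates and the forecast becomes far more dependable.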

Separation is perhaps best illustrated when it’s absent altogether. If it was found
that the distribution of Predictor A was exactly the same for both Fog and No Fog, it
would be concluded that this predictor had the same range and frequency of values in
both cases; that is, it couldn’t distinguish between the two events. Hence, it would show
zero skill as a predictor. Fig. 3.1d is an example where the frequency distributions are
so close as to be almost indistinguishable.

Separation may be achieved in a number of different ways. The type of distribu- 
tion itself, and the mean and standard deviation (hereafter referred to as sigma), all 
play an important role. A quantitative measure of separation will be defined later in 
this study; meanwhile, the following examples illustrate some of the types of separation 
that are possible: 


1. Given a common distributional form, roughly bell-shaped with common means,
the respective sigmas determine the separation. In Figs. 3.2a and 3.2b, the
sigmas are small, but the bigger difference between them in Fig. 3.2b makes for
better separation than in Fig. 3.2a. In Figs. 3.2c and 3.2d, the sigmas are larger
(Note: Figs. 3.3 and 3.4 are scaled differently); again, the separation is better in
the case of the larger sigma difference.


2. Given a common distributional form, roughly bell-shaped but with different
means, the picture becomes more complicated, with the two means and two
sigmas influencing the separation. Fig. 3.3a illustrates the statistician’s dream of
two populations that are mutually exclusive. The sigmas are both small
compared to the intermean distance. Fig. 3.3b depicts the intermean distance as
large compared to one of the sigmas. The common area beneath the two curves
is relatively small, which, of course, is one of the ways separation can occur. Fig.
3.3c shows a small intermean distance compared to both sigmas, meaning little
separation. Finally, Fig. 3.3d’s intermean distance is small compared to one of
the sigmas; some of the separation is attributable to the relatively large area of
mutually exclusive events at the tails of the flatter curve, while the rest of the
separation is due to the sharp difference in the relative frequencies in the vicinity
of the means of each population. This case is merely a variation of Figs. 3
and 3.2d, where the means are equal.


3. Given one or more non-bell-shaped distributions, separation may also be
attainable. Fig. 3.4a shows a normal and exponential distribution, but with a
common mean and sigma. Despite that, there’s excellent separation in the area
of high exponential density and moderate separation toward the right-hand
side of the graph. The “forecasting” problem is acute only at and near the
intersection of the curves. Fig. 3.4b depicts two exponential distributions; again, the
“forecasting” task is dubious only over a narrow range of data values, which is
most desirable. Finally, Figs. 3.4c and 3.4d, which look familiar from the above
examples, are actually manifestations of the Beta distribution. Fig. 3.4c is of the
general form of Fig. 3.4a, while Fig. 3.4d looks just like the distributions in the
case of Predictor A in Fig. 3.1c. Incidentally, there’s excellent separation
in both cases.


The chameleonic property of the Beta distribution, illustrated in the last example,
is part of the subject of this thesis, and will be looked at again later.


B. QUANTIFYING SEPARATION 
From the above discussion, it can be seen that the statistical separation of two
data sets is a function of the respective means and sigmas, themselves functions of the
distribution of the data. To quantify separation, a number called the Signal-to-Noise
Ratio (S/N) has been developed by Lowe³, where

$$S/N = \frac{(\delta\mu)^2 (N_1 + N_2 - 2)}{N_1\sigma_1^2 + N_2\sigma_2^2}.$$

S/N is a measure of the intermean distance $\delta\mu$, modified by the sizes ($N_1$, $N_2$) and
standard deviations ($\sigma_1$, $\sigma_2$) of the respective populations.
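
The S/N ratio defined above can be sketched in a few lines. This is an illustrative implementation only: the function name is hypothetical, and the use of the population (divide-by-N) variance for each sigma is an assumption, since the text does not say which estimator was used.

```python
def signal_to_noise(fog, no_fog):
    """S/N = (dmu)^2 * (N1 + N2 - 2) / (N1*sigma1^2 + N2*sigma2^2)."""
    n1, n2 = len(fog), len(no_fog)
    mu1 = sum(fog) / n1
    mu2 = sum(no_fog) / n2
    # Population variances (assumption: divide by N rather than N - 1).
    var1 = sum((x - mu1) ** 2 for x in fog) / n1
    var2 = sum((x - mu2) ** 2 for x in no_fog) / n2
    intermean = mu1 - mu2
    return intermean ** 2 * (n1 + n2 - 2) / (n1 * var1 + n2 * var2)

# Made-up predictor values, for illustration only.
fog_values = [288.0, 290.0, 292.0]
no_fog_values = [295.0, 297.0, 299.0]
print(signal_to_noise(fog_values, no_fog_values))
```

In effect the ratio is the squared intermean distance divided by a pooled variance, so it grows as the means move apart or as either population tightens around its mean.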

However, S/N doesn’t tell the whole story, as a glance at Fig. 3.4a shows. Even 
though the means and sigmas are equal, there’s still significant separation between the 
data sets, and forecasting one or the other of the two events has good prospects of 
success, especially for lower values of the data. Hence, an important caveat to the use of
the S/N is that the data sets themselves must be examined to determine their
characteristic distributions. If the characteristic distributions are dissimilar, this fact itself often
overrides whatever indications the S/N might give.
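
The caveat can be checked numerically with a Normal and an exponential density that share the same mean and sigma, as in the Fig. 3.4a example. The common value of 1.0 for both mean and sigma is an assumption chosen for convenience (an exponential distribution with unit rate has mean 1 and sigma 1); it is not read from the figure.

```python
import math

MEAN = SIGMA = 1.0  # assumed common mean and sigma

def normal_pdf(x):
    """Normal density with the assumed mean and sigma."""
    z = (x - MEAN) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))

def exponential_pdf(x):
    """Exponential density with rate 1/MEAN, so mean = sigma = MEAN."""
    rate = 1.0 / MEAN
    return rate * math.exp(-rate * x) if x >= 0.0 else 0.0

# Equal means give S/N = 0, yet the two likelihoods differ over much of the range.
for x in (0.1, 1.0, 4.0):
    n, e = normal_pdf(x), exponential_pdf(x)
    print(f"x = {x}: normal = {n:.4f}, exponential = {e:.4f}, ratio = {e / n:.2f}")
```

With these assumed parameters the exponential likelihood is several times the Normal one at low values and again in the far right-hand tail, while the two are close near the mean, which is consistent with the qualitative description of Fig. 3.4a.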


C. REAL DATA 

Of course, real-world data aren’t as clear-cut as the above examples, and
separation is sometimes an elusive goal. For one thing, such data may be only roughly fitted
to their “characteristic” distribution, with a poor “goodness-of-fit” at times. Also, real
data are often contaminated; for example, outlying values which may be physically
unrealistic may nevertheless creep in due to data processing problems. In short,
real-world data are considerably more messy, and occasionally of little forecasting use.

In the case of output from a numerical weather prediction model, such as MOS 
predictors, even the “best” predictors can be expected to have a fair degree of impreci- 
sion. After all, these are only as good as the raw input data, and subject to mathe- 
matical and physical simplifications. By finding out which predictor works best for a 
given meteorological phenomenon like Fog, including learning as much as possible 
about that predictor’s statistical distribution, resources can be concentrated on 
improving the model's ability to forecast with that particular predictor. 

In this study, 59 NOGAPS predictors were available, a number of which were 
combined to form derived predictors. These are listed in Appendix C. Since it was 
necessary to find some way to reduce this number to a more manageable one, a mix of 
statistical and meteorological insights was used to arrive at a short list of candidate
variables.


³P. R. Lowe, Naval Environmental Prediction Research Facility, Monterey, CA.




D. PREDICTOR SELECTION

Statistical insights led to computing the S/N for each predictor, and rank 
ordering them by S/N value. Concurrently, histograms of each predictor were plotted 
to estimate and compare the overall shape of the predictor’s distributions for Fog and 
No Fog. Those predictors whose S/N ratios were less than 0.5 and with broadly 
similar distributions for both populations were eliminated from further consideration. 

Candidate predictors were then examined from a meteorological perspective to 
see if they made sense physically. As expected, those predictors at the top of the list 
were related in some way to the marine atmospheric boundary layer or air-sea
interface. However, it is surprising that the derived predictors Surface Relative Humidity,
Surface Air Temperature Advection and Sea-surface Temperature Advection showed 
no promise as statistical predictors. These advective quantities were computed using 
numerical model wind and temperature data. 

Time constraints restricted detailed examination to the following six predictors: 

1. Surface Moisture Flux (SMF) 

S/N: 1.006. This is equivalent to Evaporative Moisture Flux, which Koziara
et al. (1983) found to be the best parameter for predicting marine fog over the North
Pacific Ocean. A downward flux results when a moist airmass is cooled to saturation
at the sea surface, setting up a favorable condition for fog formation and maintenance.

2. T925 - SST

S/N: 1.510. This is defined as the difference between the air temperature at 
925mb and the sea-surface temperature, which Diunizio (1984) recommended as a 
prospective predictor. The 925mb temperature gives a better S/N than does the surface 
air temperature. Marine fog is usually associated with a negative difference between 
these quantities. The conjunction of relatively cold air and warm water over the Gulf 
Stream in the location examined in this study would be expected to amplify this
difference.

3. Entrainment (ENT) 

S/N: 0.777. The degree of turbulent mixing in the marine boundary layer 
should be reflected in this predictor. Barker (1975) states that entrainment from the 
inversion layer into the boundary layer is an important source of heat and drier air, 
effectively acting to retard fog formation. 

4. Long-wave Radiation (LWR) 




S/N: 0.732. Fog tends to trap outgoing long-wave radiation from the surface, 
and reradiate some of it back down. Thus, the Fog and No Fog regimes might be 
expected to show differences in LWR profiles, with the incidence of Fog negatively 
correlated with LWR. Renard and Thompson (1984) found infrared extinction param- 
eters to be useful in their work on visibility over the North Pacific Ocean. 

5. Sensible Heat Flux (SHF) 

S/N: 0.816. Mack et al. (1983) link a downward heat flux with marine fog
formation. A downward flux reflects the movement of warmer low-level air over a
colder ocean surface, a condition necessary for advection fog over the ocean. A link
was also shown between a slight upward heat flux and the maintenance of fog
poleward of the area of maximum sea-surface temperature gradient.

6. Stratus Frequency (STF) 

S/N: 0.626. Pilie et al. (1979) linked the occurrence of stratus with coastal fog
off California. This is not surprising since fog is surface-based stratus, and the
frequency of the two would be expected to be related.

IV. DATA


A. AREA 
The area of study was confined to a region in the North Atlantic Ocean, one of a
number of climatologically homogeneous regions.⁴ This area, off the coast of
Newfoundland, has a relatively high occurrence of marine fog compared to the other
regions in the North Atlantic Ocean (Renard, 1980). This region is identified in Fig.
4.1.


B. TIME 

Data from the months of June, July and August, 1984 and 1985, were meshed 
into a single data set. The only synoptic ship reports used were those at 1200 GMT, 
since this is a daylight hour over the region. It was felt that synoptic data at 0000
GMT, under fading light conditions at best, would not be as reliable for the detection
of fog.


C. RAW VERIFICATION DATA SET 

A “raw” data set was compiled by the Naval Oceanography Center Detachment, 
co-located with the National Climatic Data Center (NCDC), in Asheville, NC. The 
data set consists of all ship synoptic observations for this study region and period on 
file at NCDC. 

Each report has been graded by NCDC for accuracy and consistency, from the 
temporal, spatial and meteorological perspectives. Flags have been inserted in the data
as an indicator of reliability and to alert the user to questionable reports.


D. REFINED VERIFICATION DATA SET

To refine the raw verification data set, certain deletions became necessary. 
Reports whose geographical location had been questioned by NCDC were discarded. 
Reports which provided no reliable evidence of either the presence or absence of fog 
were deleted. An example would be a buoy, which might report only a sea-surface 
temperature. Also deleted were all but one of multiple observations from the same
location and time; the observation retained was the one deemed most reliable by NCDC. In
addition, reports close to shore were eliminated; these are susceptible to contamination
from land-based predictor values during subsequent interpolation of such values to
near-shore ship observation locations.

⁴As determined by P. R. Lowe of the Naval Environmental Prediction Research
Facility, Monterey, CA.

Once these deletions were made, the data set was divided into two parts
corresponding to Fog/No Fog. The criteria for this division are listed in Appendix B.


E. MODEL OUTPUT PREDICTORS 
While the verification data set consists of surface observations at scattered points 
over the area in question, NOGAPS predictions are made for grid points. The 
following steps were taken to prepare these data for use: 
1. Interpolation to Observation Points 
The predictor values were interpolated from grid points to ship observation 
points, using a bilinear interpolation technique. Thus, the predictor data set was 
reduced in length so as to be equal to the length of the accepted surface observations 
data set. 
2. Predictor Subsets 
The data set was divided into subsets corresponding to each predictor; the size
of each predictor subset depends on the number of missing predictor values. If there
weren’t any, the subset length equals that of the overall predictor data set.
3. Fog/No Fog Subsets
Each predictor subset was further divided in two, based on whether the
associated verification observation indicated Fog or No Fog.
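
The interpolation in step 1 can be sketched as follows. This is a minimal illustration, not the program used in the study: the function name, the dictionary layout of the gridded field and the sample values are all hypothetical, and the grid is assumed to be aligned on exact multiples of 2.5 degrees.

```python
import math

GRID_STEP = 2.5  # degrees; grid spacing quoted for NOGAPS in this study

def bilinear(lat, lon, field):
    """Interpolate a gridded predictor to a ship position.

    field maps (lat, lon) grid-point tuples to predictor values; this
    dictionary layout is a hypothetical convenience for the sketch.
    """
    lat0 = math.floor(lat / GRID_STEP) * GRID_STEP  # grid row to the south
    lon0 = math.floor(lon / GRID_STEP) * GRID_STEP  # grid column to the west
    lat1, lon1 = lat0 + GRID_STEP, lon0 + GRID_STEP
    ty = (lat - lat0) / GRID_STEP  # fractional distance northward
    tx = (lon - lon0) / GRID_STEP  # fractional distance eastward
    south = (1.0 - tx) * field[(lat0, lon0)] + tx * field[(lat0, lon1)]
    north = (1.0 - tx) * field[(lat1, lon0)] + tx * field[(lat1, lon1)]
    return (1.0 - ty) * south + ty * north

# Made-up predictor values at the four grid points around one ship report.
sample_field = {
    (45.0, -52.5): 10.0, (45.0, -50.0): 20.0,
    (47.5, -52.5): 30.0, (47.5, -50.0): 40.0,
}
print(bilinear(46.25, -51.25, sample_field))
```

Applying such a function to every accepted ship report yields one interpolated predictor value per observation, which is why the predictor data set ends up the same length as the verification data set.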

V. MODUS OPERANDI


A. PREPARATORY 

Once the predictor data were divided into subsets corresponding to the Fog and 
No Fog populations of each one, as described earlier, the S/N test was conducted on 
each predictor data set. A short list of candidate predictors was then drawn up,
forming the basis of all further work in this study.


B. COMPUTER PROGRAM 

For each candidate predictor, a computer program first randomly splits
in half the two candidate predictor populations, Fog and No Fog, forming a training
and a testing set. Once the data are split, the mean ($\mu$) and sigma ($\sigma$) of the Fog and
No Fog training sets are used to compute bounds to the range of the data, A and B,
defined as follows:

$$A = \min(\mu_1 - 3\sigma_1,\ \mu_2 - 3\sigma_2), \quad \text{and}$$
$$B = \max(\mu_1 + 3\sigma_1,\ \mu_2 + 3\sigma_2).$$


Data values outside these bounds were discarded in order to eliminate the undue
influence that such outlying values exert when trying to fit a distribution to the remaining
values. Thus, the maximum and minimum values of the remaining data became the
range over which the distributions were fitted.
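
The split and the outlier bounds described above might be sketched as follows. The function names are hypothetical, the random seed is arbitrary, and the use of the population standard deviation is an assumption; the original program was written in Fortran and its details are not given here.

```python
import random
import statistics

def split_half(values, rng):
    """Randomly split one population into training and testing halves."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def three_sigma_bounds(fog_train, no_fog_train):
    """A = MIN(mu1 - 3*sigma1, mu2 - 3*sigma2), B = MAX(mu1 + 3*sigma1, mu2 + 3*sigma2)."""
    mu1, s1 = statistics.mean(fog_train), statistics.pstdev(fog_train)
    mu2, s2 = statistics.mean(no_fog_train), statistics.pstdev(no_fog_train)
    a = min(mu1 - 3.0 * s1, mu2 - 3.0 * s2)
    b = max(mu1 + 3.0 * s1, mu2 + 3.0 * s2)
    return a, b

def trim(values, a, b):
    """Discard values lying outside the bounds [a, b]."""
    return [v for v in values if a <= v <= b]

rng = random.Random(0)  # arbitrary seed so the sketch is repeatable
train, test = split_half(range(10), rng)
a, b = three_sigma_bounds([1.0, 2.0, 3.0], [10.0, 12.0, 14.0])  # made-up data
print(train, test, round(a, 3), round(b, 3))
```

Because the bounds use whichever population reaches furthest in each direction, a value is discarded only if it is an outlier with respect to both the Fog and the No Fog training sets.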


C. TRAINING SET

The training set was used to generate statistics that are assumed to characterize 
the population as a whole. This assumption was based on visual comparisons of the 
empirical distributions, as discussed below. 

1. Empirical Distribution of Training Data Set

Both the Fog and No Fog training sets were inserted into Grafstat, an
interactive statistics package, which is described in more detail in Appendix A. In Grafstat,
an empirical plot of each data set was made on the same graph. This served as a check
on the randomness of the splitting procedure, since one would expect to see the same
overall pattern as with the entire data set. This also gave an idea of the overall shape
of the distribution. Finally, the plot showed at a glance the separation of the two
populations (or lack thereof). Empirical plots of the training sets of the six predictors
are shown in Fig. 4.2.
2. Fitting Distributions 
While in Grafstat, the training populations were fitted to the Beta, Gamma 
and Normal distributions, using the Method of Moments procedure. This method was 
chosen over the Maximum Likelihood method due to the difficulty of adapting the 
latter to a Fortran computer program. These fits are shown in Appendix D. 
3. Using Distribution Parameters 
The main computer program used in this study computed the statistical 
parameters of the Beta, Gamma and Normal distributions. The accuracy of these was 
checked against corresponding values generated by Grafstat. 
4. Output from the Training Sets 
The statistics gleaned from the training set would be applied later to samples 
drawn from the testing set, in the hope of distinguishing the Fog from the No Fog 
cases in the latter. These statistics are the parameters for the Beta, Normal and
Gamma distributions, $\alpha$, $\beta$, $\mu$, $\sigma$, $\lambda$ and $\eta$ respectively.
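
As an illustration of the Method of Moments step, the sketch below derives parameters for the three distributions from the sample mean and variance. These are the standard textbook moment estimators, not necessarily the exact formulas coded in the study's program. The function names are hypothetical, the Gamma fit is returned as (shape, rate) and assumes strictly positive data, and the Beta fit assumes the data are first rescaled from the fitting range [A, B] to [0, 1].

```python
import statistics

def fit_normal(data):
    """Method of moments for the Normal: (mean, sigma)."""
    return statistics.mean(data), statistics.pstdev(data)

def fit_gamma(data):
    """Method of moments for the Gamma: (shape, rate).

    Assumes strictly positive data; how the study handled predictors
    with negative values is not stated in the text.
    """
    m = statistics.mean(data)
    v = statistics.pvariance(data)
    return m * m / v, m / v

def fit_beta(data, a, b):
    """Method of moments for the Beta on [a, b]: (alpha, beta)."""
    scaled = [(x - a) / (b - a) for x in data]  # map the range onto [0, 1]
    m = statistics.mean(scaled)
    v = statistics.pvariance(scaled)
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common

sample = [2.0, 4.0, 6.0]  # made-up training values
print(fit_normal(sample), fit_gamma(sample), fit_beta(sample, 0.0, 8.0))
```

Each fit needs only the first two sample moments, which is what makes this method simpler to program than Maximum Likelihood, where the Beta and Gamma parameters must be found iteratively.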


D. TESTING SET

Ten samples, each 10% of its population in length, were randomly drawn from each
population of the testing set. This preserved the same relative frequency between Fog and No Fog cases
as in the original data sets. For each value in each sample, the goal was to determine 
the probability of its belonging to one or other of the populations. Since the two 
populations were of different sizes, with different prior probabilities, recourse was made 
to Bayes’ Law. 

1. Bayes’ Law 

The forecast method involved applying Bayes’ Law to each of the sample
predictor values. This law takes into account the prior, unconditional probabilities of
each event. For the purposes of this study, the prior probabilities were defined by the
relative sizes of the two populations (approximately 65% and 35% for No Fog and
Fog respectively). Bayes’ Law for the conditional probability of the event Fog, given
the occurrence of a predictor value A, may be stated as follows:

$$P(\mathrm{Fog} \mid A) = \frac{P(\mathrm{Fog})\, f(A \mid \mathrm{Fog})}{P(\mathrm{Fog})\, f(A \mid \mathrm{Fog}) + P(\mathrm{No\ Fog})\, f(A \mid \mathrm{No\ Fog})}$$


where P(Fog), P(No Fog) are the unconditional (prior) probabilities, based on the 
relative sizes of the two training populations; and f(A|Fog), f(A|No Fog) are the class 
conditional likelihoods that a value A will be associated with Fog or No Fog respec- 
tively. These likelihoods are computed using the statistical parameters obtained from 
the training set, applied to the three distributions being examined. An analogous 
equation may be written for P(No Fog|A), the conditional probability of No Fog. 
2. Scoring the Forecasts 

Whenever Bayes’ Law computed a probability of ≥50% that a predictor
value, known to be from a certain population, indeed came from that population, it
was considered a hit. A tally was kept of the number of hits and misses per event,
sample and distribution. Contingency tables were generated for each sample and a
number of different skill scores were computed. These scores are defined in Chapter
VI.
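
The Bayes’ Law forecast and the hit criterion can be sketched as below. For brevity only a Normal likelihood is shown; a Beta or Gamma density could be passed in the same way. The prior of 0.35 for Fog follows the approximate figure given in the text, while the function names, the means of 290 and 297 and the sigma of 3 are hypothetical values chosen for illustration.

```python
import math

def normal_pdf(x, mean, sigma):
    """Normal density, used here as the class-conditional likelihood."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior_fog(a, prior_fog, f_fog, f_no_fog):
    """P(Fog | A) from Bayes' Law with two mutually exclusive events."""
    numerator = prior_fog * f_fog(a)
    denominator = numerator + (1.0 - prior_fog) * f_no_fog(a)
    return numerator / denominator

def is_hit(a, truth_is_fog, prior_fog, f_fog, f_no_fog):
    """A hit: the value's own population receives a probability of 50% or more."""
    p_fog = posterior_fog(a, prior_fog, f_fog, f_no_fog)
    p_true = p_fog if truth_is_fog else 1.0 - p_fog
    return p_true >= 0.5

PRIOR_FOG = 0.35  # approximate prior quoted in the text

def f_fog(a):
    return normal_pdf(a, 290.0, 3.0)  # hypothetical training parameters

def f_no_fog(a):
    return normal_pdf(a, 297.0, 3.0)  # hypothetical training parameters

for value in (290.0, 297.0):
    print(value, round(posterior_fog(value, PRIOR_FOG, f_fog, f_no_fog), 4))
```

When the two likelihoods are equal the posterior simply returns the prior, so with unequal population sizes a value must favor Fog fairly clearly before Fog becomes the more probable event.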

VI. SCORING AND TESTING FOR SIGNIFICANCE


A. GENERAL 

The overall goal was to determine if the Beta distribution could serve as a proxy 
for the normal distribution for certain meteorological predictors produced by the 
NOGAPS model. In the process, the Gamma distribution was also tested. Using the
statistical parameters of each distribution, forecasts were made and the results
compared for significant differences. For each predictor, three two-way comparisons
were made: Beta-Normal, Beta-Gamma and Normal-Gamma.


B. SCORES COMPUTED 

Each sample drawn from a testing set represents one set of forecasts. For each 
forecast, five percentage scores were derived from contingency tables. These scores are 
defined as follows (an incorrect forecast of Fog means Fog was forecasted but not 
observed): 


1. Threat Score (TS): Number of correct forecasts of Fog divided by the sum of all
observations of Fog and all incorrect forecasts of Fog. 


2. Percentage Correct (PC): Number of correct Fog and No Fog forecasts divided 
by all forecasts. 


3. Power of Detection (PD): Number of correct forecasts of Fog divided by all 
observations of Fog. 


4. Forecast Reliability (FR): Number of correct forecasts of Fog divided by all 
forecasts of Fog. 


5. False Alarm Rate (FA): Number of incorrect forecasts of Fog divided by all 
observations of No Fog. 


These scores were also averaged over ten repetitive sample runs for an overall
score. Except for the False Alarm score (FA), the higher the score the better the
forecast.
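
The five definitions above can be sketched in Python as follows. This is a minimal illustration working from a 2x2 Fog/No Fog contingency table; the function name, argument names and example counts are hypothetical, and no guard against empty categories (division by zero) is included.

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """Five scores from a 2x2 Fog/No Fog contingency table.

    hits              -- Fog forecast and Fog observed
    false_alarms      -- Fog forecast but not observed
    misses            -- Fog observed but not forecast
    correct_negatives -- No Fog forecast and No Fog observed
    """
    total = hits + false_alarms + misses + correct_negatives
    return {
        # correct Fog forecasts / (all Fog observations + incorrect Fog forecasts)
        "TS": hits / (hits + misses + false_alarms),
        # correct Fog and No Fog forecasts / all forecasts
        "PC": (hits + correct_negatives) / total,
        # correct Fog forecasts / all Fog observations
        "PD": hits / (hits + misses),
        # correct Fog forecasts / all Fog forecasts
        "FR": hits / (hits + false_alarms),
        # incorrect Fog forecasts / all No Fog observations
        "FA": false_alarms / (false_alarms + correct_negatives),
    }

# Hypothetical counts, for illustration only.
scores = contingency_scores(hits=40, false_alarms=20, misses=30,
                            correct_negatives=110)
print(scores)
```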


C. PENALTY-REWARD SCORE 
In addition to the contingency table scores, each forecast was scored using the
Penalty-Reward (PR) score, devised by P. R. Lowe of the Naval Environmental
Prediction Research Facility, Monterey, CA. This score measures the skill of the
probabilistic forecasts of two-category events, such as Fog/No Fog.

For this study, the PR score is defined for the dichotomous case of Fog/No Fog,
where P(No Fog) is greater than P(Fog). A separate PR score is computed for each 
individual forecast made, and an overall PR score is calculated as the mean of the 
individual scores. Only the overall scores are shown in this study. 


The PR score is computed as follows: 


For P(Fog|A) < P(Fog):

$$PR = \left(I_2 - I_1 (X - 1)\right)(1 - Y)^2$$

For P(Fog|A) > P(Fog):

$$PR = \left(I_1 (X - 1) - I_2\right)\left(\frac{P(\text{Fog} \mid A) - P(\text{Fog})}{1 - P(\text{Fog})}\right)^2$$

where
1. P(Fog|A), P(Fog) are as defined for Bayes’ Law
2. $I_2$ = 0 if Fog occurs, = 1 if No Fog occurs
3. $I_1$ = 1 if Fog occurs, = 0 if No Fog occurs
4. X = 1/P(Fog)
5. Y = P(Fog|A)/P(Fog)
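
A minimal Python sketch of the PR calculation follows. It assumes that the trailing superscripts in the two expressions are squares, which is a reading of the source rather than a confirmed detail, and the function names are hypothetical. When P(Fog|A) equals P(Fog), both branches give zero, so the choice of branch at equality does not matter.

```python
def pr_score(p_post, p_prior, fog_occurred, power=2):
    """Penalty-Reward score for one forecast.

    p_post  -- P(Fog|A) from Bayes' Law
    p_prior -- P(Fog), the unconditional probability
    power   -- exponent on the distance term (assumed to be 2)
    """
    i1 = 1.0 if fog_occurred else 0.0   # I1: 1 if Fog occurs
    i2 = 1.0 - i1                       # I2: 1 if No Fog occurs
    x = 1.0 / p_prior
    y = p_post / p_prior
    if p_post < p_prior:
        return (i2 - i1 * (x - 1.0)) * (1.0 - y) ** power
    return (i1 * (x - 1.0) - i2) * ((p_post - p_prior) / (1.0 - p_prior)) ** power

def overall_pr(forecasts, p_prior):
    """Mean of the individual PR scores; forecasts holds (p_post, fog_occurred)."""
    return sum(pr_score(p, p_prior, occ) for p, occ in forecasts) / len(forecasts)

# A confident No Fog forecast that verifies earns the full reward of 1.
print(pr_score(0.0, 0.35, fog_occurred=False))
```

Under this reading, a correct forecast is rewarded and an incorrect one penalized, with the rarer event (Fog) carrying the larger weight X - 1.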


D. TESTING FOR SIGNIFICANCE 

A one-way Analysis of Variance (Anova) procedure, using ten samples, is used to
test for significant differences between the forecast scores obtained by pairs of
distributions. The results for each two-way Anova comparison are given in Appendix E,
where AnovaBN, AnovaBG and AnovaGN refer to comparisons between Beta and
Normal, Beta and Gamma, and Gamma and Normal respectively. The level of
significance was set at 0.05. P-values less than this number indicate that a significant
statistical difference exists between the two scores being compared.
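
The comparison described above can be sketched as a one-way Anova F statistic for two groups of ten scores. This is a minimal standard-library illustration, not the procedure actually run for the study; the function name and the score values are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way Anova."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    # Variation of the group means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the scores about their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_b, df_w = k - 1, n_total - k
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w

# Hypothetical TS values from ten sample runs per distribution.
beta_ts = [0.47, 0.48, 0.46, 0.49, 0.47, 0.48, 0.47, 0.46, 0.48, 0.47]
normal_ts = [0.47, 0.47, 0.48, 0.46, 0.48, 0.47, 0.49, 0.47, 0.46, 0.48]

f_stat, df_between, df_within = one_way_anova_f([beta_ts, normal_ts])
print(f_stat, df_between, df_within)
```

With two groups of ten there are 1 and 18 degrees of freedom, for which the 0.05 critical value of F is about 4.41; a statistics library routine such as `scipy.stats.f_oneway` would return the P-value directly.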


E. JUDGING THE RESULTS

Since the hypothesis to be validated is that the Beta distribution may be used as
a proxy for the Normal distribution, testing it is equivalent to checking whether a
significant, negative difference exists between these two distributions. There are five
possible significance profiles that could occur:

1. Beta could be significantly better than Normal.
2. Beta could be insignificantly better than Normal.
3. Beta could be exactly the same as the Normal.
4. Beta could be insignificantly worse than Normal.
5. Beta could be significantly worse than Normal.

Of this list, only item 5 would nullify the basic hypothesis. Analogous comparisons
between Beta and Gamma, and Gamma and Normal were also made.




VII. RESULTS


A. GENERAL 

For each predictor examined, three histograms of each population (Fog and No 
Fog) were generated using Grafstat. Pairs of the three distributions being examined 
were fitted to each histogram, allowing three two-way visual comparisons of the ability 
of these distributions to fit the data. In generating these histograms, the data were 
standardized between 0 and 1 for each population separately, since these are the limits 
of a Beta distribution. 
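
A minimal Python sketch of this preparation step follows. It assumes that "standardized between 0 and 1" means min-max rescaling, and it fits the Beta parameters by the Method of Moments, which Chapter VIII states was the estimation method employed. The function names and sample values are hypothetical.

```python
from statistics import mean, variance

def rescale_unit(values):
    """Min-max rescaling onto [0, 1] (assumed form of the standardization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def beta_method_of_moments(values):
    """Beta shape parameters from the sample mean and variance.

    Requires variance < mean * (1 - mean); otherwise no Beta fit exists.
    """
    m = mean(values)
    v = variance(values)
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common

# Hypothetical predictor values for one population.
raw = [2.0, 3.5, 4.0, 4.5, 5.0, 6.5, 9.0]
unit = rescale_unit(raw)
alpha, beta = beta_method_of_moments(unit)
print(alpha, beta)
```

Because each population is rescaled separately, the fitted parameters for Fog and No Fog refer to different underlying ranges of the raw predictor.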

Tables 1 and 2 show the results of the forecasting procedure for each of the three 
distributions used; these are contained in the first three lines of each predictor segment. 
The next three lines show the Anova probability values (P-values); these are for three 
one-way analyses of different pairs of distributions, and are identified by the first initial 
of each member of the particular pair being examined. A P-value less than 0.05 was 
taken to indicate significant differences between scores. 

The last line ranks the three distributions in descending order of forecasting skill, 
using the first initial of each one. Where the Anova results show no significant differ- 
ence between all three, the initials are not separated, i.e. BNG. Significant differences 
are indicated by a dash between the distributions. For example, B-N-G indicates a 
significant difference between all three. BN-G indicates that Gamma is significantly 
worse than both Beta and Normal, while the latter are not significantly different from 
each other. A few cases occurred where the first and third ranked distributions were 
significantly different from each other but not from the second ranked one; this is indi- 
cated in the table. 
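
The ranking notation can be sketched in Python as follows. This is a hypothetical helper, not part of the study's software: it takes the initials in descending order of skill and the set of pairs judged significantly different, and places a dash between adjacent ranks that differ significantly.

```python
def rank_string(ranked, significant):
    """Build a ranking label such as 'BN-G'.

    ranked      -- initials in descending order of skill, e.g. ['B', 'N', 'G']
    significant -- set of frozenset pairs whose Anova P-value was below 0.05
    """
    out = ranked[0]
    for prev, cur in zip(ranked, ranked[1:]):
        if frozenset((prev, cur)) in significant:
            out += "-"   # dash marks a significant difference
        out += cur
    return out

# Gamma significantly worse than both Beta and Normal.
label = rank_string(["B", "N", "G"], {frozenset("NG"), frozenset("BG")})
print(label)
```

The sketch checks adjacent ranks only, so the case where the first and third distributions differ significantly but neither differs from the second would need a separate flag, as is done in the tables.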

In making the forecasts, time and programming constraints dictated that the
same distribution be used on the Fog and No Fog populations, as opposed to mixing
the distributions by using the “best fit” ones for each population.


B. SURFACE MOISTURE FLUX (SMF) 
The empirical distributions (Fig. 4.2) are somewhat bell-shaped, with a fair degree 
of visual separation between the populations. Separation is enhanced since the No
Fog population is clearly more dispersed than the Fog population.

Looking at the histograms with the distribution fits (Fig. 7.1), the Gamma distribution
fits both populations very well. For the Fog population, Gamma appears to
capture the peak of the data a little better than the Beta, as well as to reflect the skew- 
ness more accurately. The Normal is clearly less able to fit these data, and, of course, 
is unable to reflect any skewness at all. For the No Fog case, there’s little difference 
between Beta and Normal. However, the Gamma distribution does better than either 
of the others both in showing the overall shape of the data and in capturing the skewed 
peak. Visually, then, the Gamma is the best fit, followed by the Beta and Normal in 
that order, for both populations. 

The forecasting scores show TS values between 0.426 and 0.492, comparable with 
those found in the North Pacific Ocean experiments referenced in Chapter I. The TS 
value is perhaps the most important score meteorologically, since it measures the 
ability of a predictor to forecast “threatening” events, such as fog. For TS, there is no 
significant difference between Beta and Normal, both of which are significantly better 
than Gamma. This is surprising, since the Gamma is the better fit visually. 

The PC score shows no significant difference between the distributions, with Beta 
slightly better than Normal. However, significant differences are present in each of the 
other scores, as seen from Table 1.

Only in the PD score is the Normal significantly better than the Beta. For the 
others, either there is no significant difference between these two or Beta is significantly
better than Normal. The only reflection in the forecast scores of Gamma’s visually 
superior fit to both populations is its showing in FR and FA, where it is significantly 
better than the others. 


C. Toes - SST (TDF) 

The empirical distributions (Fig. 4.2) are again bell-shaped, with a fair degree of 
visual separation between the populations. 

Looking at the histograms with the distribution fits (Fig. 7.2), the Beta and 
Normal are equally good at fitting the data; however, the Gamma looks superior again 
as it did for SMF. 

As for SMF, the TS figures (0.430 to 0.486) are comparable to those found in the 
North Pacific Ocean experiments referenced in Chapter I. For TDF, these values show
no significant difference between Gamma and Normal, or between Normal and Beta,
but indicate Gamma is significantly better than Beta.

The PC, PD and PR scores also show Gamma to be the best distribution for the
Fog/No Fog forecasts, but only PD shows it being significantly better than the others. 
For FR, all three values have insignificant differences between them, while for FA,
Gamma is significantly worse than the others.


D. ENTRAINMENT (ENT) 

The empirical densities (Fig. 4.2) are quite different, with the Fog population 
skewed to the left while the No Fog population is more evenly dispersed across the 
spectrum. Accordingly, there’s good visual separation except for values in the vicinity 
of 8. 

Looking at the histograms with the distribution fits (Fig. 7.3), Gamma appears to 
fit the Fog population best, while Beta does an excellent job with the No Fog popula- 
tion. For this population, the Normal is the second best visual fit, since the Gamma 
seems to be forcing substantial skewness where little exists. 

The TS scores range from 0.346 to 0.403, somewhat less than for SMF and TDF. 
The Normal is significantly better than the others, perhaps a reflection of its worth as 
the second best fit for both populations. Normal is also significantly better than the 
others in the PD score. 

For PR, both Normal and Gamma are significantly better than Beta, while for 
FA, Beta and Gamma are significantly better than Normal. For FR and PC, there are
no significant differences between the three distributions.


E. LONG-WAVE RADIATION (LWR) 

The empirical densities (Fig. 4.2) indicate both populations are bell-shaped with 
good separation. 

Looking at the histograms with the distribution fits (Fig. 7.4), all three distribu- 
tions fit the data quite well in both populations. The Gamma and Beta capture the 
slight skewness in each population, with the Gamma best able to focus on the central 
peak. 

The TS scores range from 0.291 to 0.315, considerably lower than for SMF and 
TDF. There were no significant differences in the TS values; similarly for PC and FR. 

The Gamma is significantly better than the others in PR, while the Normal is
significantly worse than the others in FA. For PD, the Normal is significantly better
than the Beta.


F. SENSIBLE HEAT FLUX (SHF)

The empirical densities (Fig. 4.2) indicate both populations are bell-shaped but 
with little separation. 

Looking at Fig. 7.5, the Gamma does a slightly better job of capturing the peak 
of each population; otherwise, there’s little difference between the three distributions. 
However, none of the distributions captures the data peaks very well. 

The TS values range from 0.283 to 0.314 with no significant difference between 
the three distributions. The rest of the scores also show no significant difference, 
except for FA, where the Gamma is better than the Beta, and the PR, where it is better
than the Normal.


G. STRATUS FREQUENCY (STF)

The empirical densities (Fig. 4.2) show the maximum densities of each population 
to be around zero, more so for No Fog than for Fog. 

From the histograms (Fig. 7.6), it is difficult to tell how good the separation is 
likely to be. The Beta does well in capturing the high relative frequencies around zero, 
with the Gamma doing less well. By contrast, the Normal is clearly a poorer fit to data 
like these, whose distribution resembles an exponential form. 

The TS values range from 0.304 to 0.467, with both Beta and Gamma doing very
well compared to values reported elsewhere. The Normal’s poor fit to the data is
reflected in a rather low TS value.

Except for FR, Gamma is clearly the best forecaster, with Beta and Normal 
ranked after it in that order. The TS, PD, FA and PR scores each show significant 
differences between all three distributions. In particular, it is of interest that STF is the 
only predictor examined whose PR scores show significant differences between all three 
distributions, with a rather large range of scores (0.105 to 0.226). 

STF is also the only predictor whose PC score showed any significant difference 
between distributions; in this case, Normal was significantly worse than the other two. 


Only FR shows no significant difference between any of the distributions.


VIII. CONCLUSIONS AND RECOMMENDATIONS


A. CONCLUSIONS 

This study confirms that knowledge of the underlying distributions of Fog and
No Fog populations is important to the success of forecasting for these categories. In 
particular, it indicates that predictors whose distributions are significantly skewed defi- 
nitely will be better described by a Beta or Gamma function than by a Normal. The 
best example of such a predictor is the Stratus Frequency, whose likelihood values 
peak in the vicinity of 0 and then decrease sharply toward higher values. The Normal 
distribution does a poor job of representing this predictor, while the Beta and Gamma 
do quite well. 

For predictors whose distributions are roughly bell-shaped (all except STF), the 
results are less clear-cut, and their interpretation depends on which scoring system is 
being used. In general, there are fewer significant differences between the three distri- 
butions than in the case of STF. These are outlined below. 

If the Threat Score is considered the single most important index of forecasting 
skill, this study indicates there is no significant difference between the Beta and Normal 
distributions for the predictors SMF, TDF, LWR and SHF. The Normal is significantly
better than the Beta for ENT, while for STF, the reverse is true. The Gamma
has a significantly better TS than the other two for STF, while for TDF, it is signifi- 
cantly better than Beta only. Otherwise, the Gamma is comparable to one and/or 
other of the other two. 

For the Power of Detection, frequently considered a leading index of forecasting 
skill, there is no significant difference between Beta and Normal in the predictors TDF 
and SHF, while a significant difference exists for each of the other predictors. For 
these, the difference favors Beta only for STF, while it favors Normal for SMF, ENT 
and LWR. Gamma is significantly better than the others for TDF and STF and is 
significantly worse than them for SMF. For the other predictors, Gamma is not 
significantly different from one or both of the other two distributions. 

While there are indications that Beta is generally competitive with the Normal, 
there are a number of caveats to be applied:

1. Since no forecasts were made using the “best fit” distributions of Fog and No
Fog (as explained in Chapter VII), it is not clear how different the results of
forecasting with such a combination of distributions would be from those shown
here. In theory, such forecasts should do better.

2. The Maximum Likelihood method was not tried in this study; it would be interesting
to compare the results of using this method with those used here, which
employed the Method of Moments. In particular, mixing the methods between
populations may slightly enhance the forecast skill.


3. Since no single distribution was best for all predictors, it is problematical
how to interpret the results. There is, as yet, no universally-accepted
“best” scoring system.

Except for STF, then, no definite conclusions can be reached on the basis of this 
study as to how much the goodness-of-fit of a distribution influences its ability to 
forecast. However, there is no conclusive evidence that Beta could not serve as a 
proxy for the Normal, subject to the comments and caveats above. Indeed, the same 
could be said for the Gamma, and it could well be asked whether this distribution 


might not do just as well as the Beta as a proxy for the Normal. 


B. RECOMMENDATIONS 
As a result of the work done in this study, the following ideas are offered to 
others doing future research in this general area: 


1. Other candidate distributions, such as the T-distribution, Weibull and
noe should be added to the list of distributions. Some of these may be 
better able to capture the unique shapes of some predictor populations. 


2. The “best fit” combination of distributions, mentioned earlier, should definitely
be established for each predictor population and forecasts made accordingly. 


3. The Method of Moments and the Maximum Likelihood method should be
examined poy to see which one (if any) is best suited to fitting a given
predictor. This can be done efficiently using Grafstat. 


4. A means of determining which distribution is the “best fit” for a given predictor 
population should be definitely established. Grafstat gives a number of different 
“best fit” indices; it would be useful to establish if any one of these is best suited
to numerical model predictor data. 


5. A “best” scoring method should be defined for two-event forecasting such as the
forecasting done in this study. In particular, the PR score might be a suitable
choice. In this study, PR was consistent, in that, in five of the six cases, it
showed Gamma as the best forecasting distribution, which reflected Gamma’s
generally superior way of representing the data visually on the histograms.


6. Other candidate predictors should be investigated, especially derived predictors, 
with a view to improving on the above results. 


7. More research on the relative ae of statistical separation of the populations
and the goodness of a distributional fit should be performed.


APPENDIX A
A NOTE ON GRAFSTAT 


Grafstat is a product of the International Business Machines Corporation (IBM),
and was being tested out at the Naval Postgraduate School at the time of this study. 
If successful, it will eventually be marketed commercially. 

Grafstat is an APL (A Programming Language, used for statistics) system
designed for interactive data plotting, data analysis, applied statistics, and customized 
graphics output. It has a full-screen, menu-driven interface, and contains a wide 
variety of graphics functions, a set of commonly-applied data analysis and statistical 
procedures, and utilities for cataloging full-screen responses, applications functions and 
data. 

Minimal familiarity with APL is needed to use Grafstat. Users with more exten- 
sive APL backgrounds can use APL expressions as Grafstat entries, as well as integrate 
interactively-developed full-screen responses in the user’s own APL functions. 

For this study, the most useful feature of Grafstat is its ability to evaluate a set
of data, fit a particular probability distribution to it, estimate the parameters of the 
distribution, and calculate the goodness-of-fit statistics for that distribution. Any of 18 
distributions may be specified and the parameters of a distribution may be either speci- 
fied or estimated. 

Another feature used is the plot of a data set as a continuous curve, rather than 
as a histogram. This facilitated the rapid evaluation of each data set and reduced the 
guesswork involved in determining which candidate distributions could be ignored from
the start.


APPENDIX B
VERIFICATION DATA USED IN THIS STUDY 


1. The raw verification data set consists of surface weather reports for the area
and time of interest, made available by the National Climatic Data Center (NCDC). It
was distilled into a refined verification data set containing only those observations for
which the presence or absence of fog could be definitely established. The steps in this
process are as follows:

a. Number of observations received from NCDC: 12378

b. Less those deleted due to either inconsistent locations or multiple observations at
the same time and place: 11303 observations remain.

c. Less those containing no definite evidence of the presence or absence of fog, as
defined in paragraph 2 below: 9551 observations remain.

d. Less those for which there were no model output parameters available for that
time and location: 7945 observations remain.

e. Less those located adjacent to the coastline: 5136 observations remain.

f. These 5136 observations constitute the working data set.


2. The refined data set was then divided into two categories, Fog and No Fog.
Each observation was placed in the No Fog category unless it met one of the following
criteria, in that order:

a. Fog was reported in Present Weather (codes 10, 11, 12, 28 and 40 through 49).

b. Fog was reported in past weather (code 4).

c. Visibility was less than 10 kilometers (ship synoptic codes 90 through 96), except
when (1) winds exceeded 30 knots or (2) blowing phenomena were reported
(codes 30 through 39) or (3) haze or dust was reported (codes 4 and 6) or (4) any
form of moderate, heavy or frozen precipitation was reported (all codes greater
than 59 other than codes 60, 61, 66, 80, and 91).

The final breakdown was as follows: Fog: 1788; No Fog: 3348.


3. Cases where visibility was 9 kilometers or less but no weather or obstruction
to vision were reported presented a special problem. It was decided to accept the
visibility as “truth”, and to assume an obstruction to vision was present. Shipboard
observation skills vary widely, and it was felt that fog was present in a large enough
proportion of these cases to justify retaining these reports en masse. The omission of
an obstruction to vision in such cases could be due to an incomplete observation by
the observer or to a transmission problem. In any case, the presence of restricted
visibility was deemed a positive indicator of fog (subject to the above exceptions), unless
there was explicit evidence to the contrary.


APPENDIX C


PREDICTORS

A. NOGAPS Output Predictors

SMF ......... Surface moisture flux
ENT ......... Entrainment at top of marine boundary-layer
SHF ......... Sensible heat flux
CET Pees .... Total heat flux
SRA css case  Solar radiation at surface
STF ......... Percentage frequency of Stratus
D1000 ....... 1000 mb geopotential D-value
D925 ........ 925 mb geopotential D-value
D850 ........ 850 mb geopotential D-value
D700 ........ 700 mb geopotential D-value
D500 ........ 500 mb geopotential D-value
D400 ........ 400 mb geopotential D-value
D300 ........ 300 mb geopotential D-value
D250 ........ 250 mb geopotential D-value
SST ......... Sea-surface temperature
TR eo ....... Surface air temperature
T1000 ....... 1000 mb temperature
T925 ........ 925 mb temperature
T700 ........ 700 mb temperature
T500 ........ 500 mb temperature
T400 ........ 400 mb temperature
T300 ........ 300 mb temperature
T250 ........ 250 mb temperature
BAIR ota .... Surface vapor pressure
E1000 ....... 1000 mb vapor pressure
E925 ........ 925 mb vapor pressure
E850 ........ 850 mb vapor pressure
E700 ........ 700 mb vapor pressure
E500 ........ 500 mb vapor pressure
EW cer ...... Boundary layer zonal wind component
U1000 ....... 1000 mb zonal wind component
U925 ........ 925 mb zonal wind component
U850 ........ 850 mb zonal wind component
U700 ........ 700 mb zonal wind component
U500 ........ 500 mb zonal wind component
U400 ........ 400 mb zonal wind component
U300 ........ 300 mb zonal wind component
U250 ........ 250 mb zonal wind component
NV BIEN ..... Boundary layer meridional wind component
V1000 ....... 1000 mb meridional wind component
V925 ........ 925 mb meridional wind component
V850 ........ 850 mb meridional wind component
V700 ........ 700 mb meridional wind component
V500 ........ 500 mb meridional wind component
V400 ........ 400 mb meridional wind component
V300 ........ 300 mb meridional wind component
V250 ........ 250 mb meridional wind component
VOR925 ...... 925 mb vorticity
VOR500 ...... 500 mb vorticity
Brea 5460+ .. Surface pressure
Peele ls .... Planetary boundary-layer depth
Sobol PTH ... Stratus thickness
PINT ........ Surface drag coefficient


B. Derived Predictors

LWR ......... Long-wave radiation
MUD Pees .... Difference between 925 mb air temperature and SST
BS ests ..... Difference between surface air temperature and SST
BNI Eel ..... Surface relative humidity
APA ......... Surface air temperature advection

BADD ........ Sea-surface temperature advection

Fig. 3.1 Predictor A Distributions, Fog/No Fog.
Fig. 3.2 Common Means, Different Sigmas.
Fig. 3.4 Normal, Exponential and Beta Distributions.